Algorithmic dimensionality reduction for molecular structure analysis.
Abstract
Dimensionality reduction approaches have been used to exploit the redundancy in a Cartesian coordinate representation of molecular motion by producing low-dimensional representations of that motion. These representations have been used to help visualize complex energy landscapes, to extend the time scales of simulation, and to improve the efficiency of optimization. Until recently, linear approaches to dimensionality reduction have been employed. Here, we investigate the efficacy of several automated algorithms for nonlinear dimensionality reduction in representing the conformations of trans, trans-1,2,4-trifluorocyclo-octane, a molecule whose structure can be described on a 2-manifold in a Cartesian coordinate phase space. We describe an efficient approach for the deterministic enumeration of ring conformations. We demonstrate a drastic improvement in dimensionality reduction with the use of nonlinear methods. We discuss the use of dimensionality reduction algorithms for estimating intrinsic dimensionality and their relationship to the Whitney embedding theorem. Additionally, we investigate the influence of the choice of high-dimensional encoding on the reduction. We show for the case studied that, in terms of reconstruction error (root mean square deviation), Cartesian coordinate representations and encodings based on interatom distances perform better than encodings based on a dihedral angle representation.
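The limitation of linear reduction on a curved 2-manifold can be illustrated with a toy sketch. The example below is not the paper's method or data: it assumes a torus in three dimensions as a stand-in 2-manifold, uses plain PCA (via NumPy's SVD) as the linear method, and the function name `pca_reconstruction_rmsd` is hypothetical. It measures the reconstruction error as a root mean square deviation for 1, 2, and 3 retained components.

```python
import numpy as np


def pca_reconstruction_rmsd(X, k):
    """RMSD between the rows of X and their rank-k PCA reconstruction."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k]
    # Project onto the top-k directions, then map back to the original space.
    X_hat = Xc @ W.T @ W + mean
    return float(np.sqrt(np.mean(np.sum((X - X_hat) ** 2, axis=1))))


# Toy data (an assumption for illustration): points on a torus, which is a
# 2-manifold because two angles (u, v) locate every point on it.
rng = np.random.default_rng(0)
u = rng.uniform(0.0, 2.0 * np.pi, 500)
v = rng.uniform(0.0, 2.0 * np.pi, 500)
R, r = 2.0, 0.5  # major and minor radii
X = np.column_stack([
    (R + r * np.cos(v)) * np.cos(u),
    (R + r * np.cos(v)) * np.sin(u),
    r * np.sin(v),
])

rmsd = [pca_reconstruction_rmsd(X, k) for k in (1, 2, 3)]
for k, err in zip((1, 2, 3), rmsd):
    print(f"components={k}  reconstruction RMSD={err:.4f}")
```

Although two parameters describe every point, two linear components still leave a clearly nonzero error here, and the error vanishes only when all three ambient dimensions are kept. A nonlinear method aims to recover the two underlying parameters directly.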
Similar resources
2D Dimensionality Reduction Methods without Loss
In this paper, several two-dimensional extensions of principal component analysis (PCA) and linear discriminant analysis (LDA) techniques have been applied in a lossless dimensionality reduction framework for a face recognition application. In this framework, the benefits of dimensionality reduction were used to improve the performance of its predictive model, which was a support vector machine (...
ICML 2010 Tutorial: Geometric Tools for Identifying Structure in Large Social and Information Networks
The tutorial will cover recent algorithmic and statistical work on identifying and exploiting “geometric” structure in large informatics graphs such as large social and information networks. Such tools (e.g., Principal Component Analysis and related non-linear dimensionality reduction methods) are popular in many areas of machine learning and data analysis due to their relatively-nice algorithm...
A survey of dimensionality reduction techniques based on random projection
Dimensionality reduction techniques play important roles in the analysis of big data. Traditional dimensionality reduction approaches, such as principal component analysis (PCA) and linear discriminant analysis (LDA), have been studied extensively in the past few decades. However, as the dimensionality of data increases, the computational cost of traditional dimensionality reduction methods gro...
Parameter-free Network Sparsification and Data Reduction by Minimal Algorithmic Information Loss
The study of large and complex datasets, or big data, organized as networks has emerged as one of the central challenges in most areas of science and technology. Cellular and molecular networks in biology are among the prime examples. Hence, a number of techniques for data dimensionality reduction, especially in the context of networks, have been developed. Yet, current techniques require ...
Kernel Methods for Nonlinear Discriminative Data Analysis
Optimal Component Analysis (OCA) is a linear subspace technique for dimensionality reduction designed to optimize object classification and recognition performance. The linear nature of OCA often limits recognition performance, if the underlying data structure is nonlinear or cluster structures are complex. To address these problems, we investigate a kernel analogue of OCA, which consists of ap...
Journal title:
- The Journal of Chemical Physics
Volume 129, Issue 6
Pages -
Publication date: 2008